We propose a novel high-dimensional linear regression estimator: the DiscreteDantzig Selector, which minimizes the number of nonzero regression coefficientssubject to a budget on the maximal absolute correlation between the featuresand residuals. Motivated by the significant advances in integer optimizationover the past 10-15 years, we present a Mixed Integer Linear Optimization(MILO) approach to obtain certifiably optimal global solutions to thisnonconvex optimization problem. The current state of algorithmics in integeroptimization makes our proposal substantially more computationally attractivethan the least squares subset selection framework based on integer quadraticoptimization, recently proposed in [8] and the continuous nonconvex quadraticoptimization framework of [33]. We propose new discrete first-order methods,which when paired with state-of-the-art MILO solvers, lead to good solutionsfor the Discrete Dantzig Selector problem for a given computational budget. Weillustrate that our integrated approach provides globally optimal solutions insignificantly shorter computation times, when compared to off-the-shelf MILOsolvers. We demonstrate both theoretically and empirically that in a wide rangeof regimes the statistical properties of the Discrete Dantzig Selector aresuperior to those of popular $\ell_{1}$-based approaches. We illustrate thatour approach can handle problem instances with p = 10,000 features withcertifiable optimality making it a highly scalable combinatorial variableselection approach in sparse linear modeling.
展开▼